ANGULAR SYNCHRONIZATION BY EIGENVECTORS AND 
SEMIDEFINITE PROGRAMMING 

A. SINGER* 

Abstract. The angular synchronization problem is to obtain an accurate estimation (up to 
a constant additive phase) for a set of unknown angles d\, . . . ,9n from m noisy measurements of 
their offsets 9i — 8j mod 2it. Of particular interest is angle recovery in the presence of many outlier 
measurements that are uniformly distributed in [0, 2tt) and carry no information on the true offsets. 
We introduce an efficient recovery algorithm for the unknowrn angles from the top eigenvector of a 
specially designed Hermitian matrix. The eigenvector method is extremely stable and succeeds even 
when the number of outliers is exceedingly large. For example, we successfully estimate n = 400 
angles from a full set of m = (^j") offset measurements of which 90% are outliers in less than a 
second on a commercial laptop. The performance of the method is analyzed using random matrix 
theory and information theory. We discuss the relation of the synchronization problem to the com- 
binatorial optimization problem Max-2-Lin mod L and present a semidefinite relaxation for angle 
recovery, drawing similarities with the Goemans- Williamson algorithm for finding the maximum cut 
in a weighted graph. We present extensions of the eigenvector method to other synchronization prob- 
lems that involve different group structures and their applications, such as the time synchronization 
problem in distributed networks and the surface reconstruction problems in computer vision and 
optics. 

1. Introduction. The angular synchronization problem is to estimate n un- 
known angles 9i, . . . ,9n S [0, 2tt) from m noisy measurements 6ij of their offsets 9i — 9j 
mod 27r. In general, only a subset of all possible (2) offsets are measured. The set E 
of pairs for which offset measurements exist can be realized as the edge set of 

a graph G = (V, E) with vertices corresponding to angles and edges corresponding to 
measurements . 

When all offset measurements are exact with zero measurement error, it is possible 
to solve the angular synchronization problem iff the graph G is connected. Indeed, 
if G is connected then it contains a spanning tree and all angles are sequentially 
determined by traversing the tree while summing the offsets modulo 27r. The angles 
are uniquely determined up to an additive phase, e.g.. the angle of the root. On the 
other hand, if G is disconnected then it is impossible to determine the offset between 
angles that belong to disjoint components of the graph. 

Sequential algorithms that integrate the measured offsets over a particular span- 
ning tree of the graph are very sensitive to measurement errors, due to accumulation 
of the errors. It is therefore desirable to integrate all offset measurements in a globally 
consistent way. The need for such a globally consistent integration method comes up 
in a variety of applications. One such application is the time synchronization of dis- 
tributed networks [171 US] , where clocks measure noisy time offsets ti — tj from which 
the determination of ti, . . . , t„ G M is required. Other applications include the surface 
reconstruction problems in computer vision [13|, [1] and optics |30j , where the surface 
is to be reconstructed from noisy measurements of the gradient to the surface and 
the graph of measurements is typically the two-dimensional regular grid. The most 
common approach in the above mentioned applications for a self consistent global 
integration is the least squares approach. The least squares solution is most suitable 
when the offset measurements have a small Gaussian additive error. The least squares 
solution can be efficiently computed and also mathematically analyzed in terms of the 
Laplacian of the underlying measurement graph. 
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There are many possible models for the measurement errors, and we are mainly 
interested in models that allow many outliers. An outlier is an offset measurement 
that has a uniform distribution on [0, 2tt) regardless of the true value for the offset. In 
addition to outliers that carry no information on the true angle values, there also exist 
of eourse good measurements whose errors are relatively small. We have no a-priori 
knowledge, however, which measurements are good and which are bad (outliers). 

In our model, the edges of E can be split into a set of good edges Egood and a set of 
bad edges E^^ad, of sizes nigood and m^ad respectively (with m = \E\^ mgood + rribad), 
such that 

Sij=0i-9j for {i,j} e Egood Q 

Sij Uniform {[0,2Tr)) ioi {i,j} £ Ebad 

Perhaps it would be more realistic to allow a small discretization error for the good 
offsets, for example, by letting them have the wrapped normal distribution on the 
circle with mean 9—9 j and variance cr^ (where cr is a typical discretization error). This 
discretization error can be incorporated into the mathematical analysis of Section [5] 
with a little extra difficulty. However, the effect of the discretization error is negligible 
compared to that of the outliers, so we choose to ignore it in order to make the 
presentation as simple as possible. 

It is trivial to find a solution to if some oracle whispers to our ears which 
equations are good and which are bad (in fact, all we need in that case is that Egood 
contains a spanning tree of G). In reality, we have to be able to tell the good from 
the bad on our own. 

The overdetermined system of linear equations (modulo 27r) 

9, - 9j = (5,y mod 27r, for {i,j} e E (1.2) 

can be solved by the method of least squares as follows. Introducing the complex- 
valued variables Zi — e'^' , the system (|1.2p is equivalent to 

Zi - e'^'^ Zj ^ 0, for{i,j}G-B, (1.3) 

which is an overdetermined system of homogeneous linear equations over C. To pre- 
vent the solution from collapsing to the trivial solution Zi = Z2 = ■ ■ ■ ~ Zn ~ 0, we 
set zi = 1 (recall that the angles are determined up to a global additive phase, so 
we may choose 9i = 0), and look for the solution Z2, . . . , Zn of (jl.3p with minimal 
^2-norm residual. However, it is expected that the sum of squares errors would be 
overwhelmingly dominated by outlier equations, making least squares least favorable 
to succeed if the proportion of bad equations is large (see numerical results involving 
least squares in Table |4?3)) . We therefore seek for a solution method which is more 
robust to outliers. 

Maximum likelihood is an obvious step in that direction. The maximum likelihood 
solution to p.ip is simply the set of angles 9i, . . . , 9„ that satisfies as many equations 
of (|1.2p as possible. We may therefore define the self consistency error (SCE) of 
9i, . . . ,9n as the number of equations not being satisfied 

SCEi9u . . . , 9„) = j} eE:9,-9j^ 5,, mod 27r}. (1.4) 

As even the good equations contain some error (due to angular discretization and 
noise), a more suitable self consistency error is SCEf that incorporates some penalty 
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function / 



scEf{e^,...,e.^)^ f{e,-6j-5,j), (1.5) 

where / : [0,27r) ^ K is a smooth periodic function with /(O) = and J {9) = 1 for 
\9\ > 6*0, where 6*0 is the allowed discretization error. The minimization of (|1.5p is 
equivalent to maximizing the log likelihood with a different probabilistic error model. 

The maximum likelihood approach suffers from a major drawback though. It is 
virtually impossible to find the global minimizer 0i , . . . , 6'„ when dealing with large 
scale problems (n ^ 1), because the minimization of either (|1.4p or (|1.5p is a non- 
convex optimization problem in a huge parameter space. It is like finding a needle in 
a haystack. 

In this paper we take a different approach and introduce two different estimators 
for the angles. The first estimator is based on an eigenvector computation while the 
second estimator is based on a semidefinite program (SDP) |38] . Our eigenvector 
estimator ^i, . . . is obtained by the following two-step recipe . In the first step, 
we construct an n x n complex- valued matrix H whose entries are 

^^^-[ {^,J}^E ' (1-6) 

where i = The matrix H is Hermitian, i.e. Hij — Hji, because the offsets are 

skew-symmetric dij = ~Sji mod 2tt. As H is Hermitian, its eigenvalues are real. The 
second step is to compute the top eigenvector vi of H with maximal eigenvalue, and 
to define the estimator in terms of this top eigenvector as 

^-l,...,n. (1.7) 



The philosophy leading to the eigenvector method is explained in Section [2l 
The second estimator is based on the following SDP 

max trace(ife) (1.8) 

s.t.QhO (1.9) 
e,, = 1 t = l,2,...,n, (1.10) 

where ^ is a shorthand notation for O being a Hermitian semidefinite positive 
matrix. The only difference between this SDP and the Goemans- Williamson algo- 
rithm for finding the maximum cut in a weighted graph |18j is that the maximization 
is taken over all semidefinite positive Hermitian matrices with complex-valued en- 
tries rather than just the real-valued symmetric matrices. The SDP-based estimator 
01, . . . ,9n is derived from the normalized top eigenvector vi of Q by the same rounding 
procedure (|1.7p . Our numerical experiments show that the accuracy of the eigenvector 
method and the SDP method are comparable. Since the eigenvector method is much 
faster, we prefer using it for large scale problems. The eigenvector method is also 
numerically appealing, because in the useful case the spectral gap is large, rendering 
the simple power method an efficient and numerically stable way of computing the 
top eigenvector. The SDP method is summarized in Section [3l 

In Section 0] we use random matrix theory to analyze the eigenvector method for 
two different measurement graphs: the complete graph and "small- world" graphs [39j . 
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Our analysis shows that the top eigenvector of H in the complete graph case has a 
non-trivial correlation with the vector of true angles as soon as the proportion p of 
good offset measurements becomes greater than In particular, the correlation 

goes to 1 as np^ — oo, meaning a successful recovery of the angles. Our numerical 
simulations confirm these results and demonstrate the robustness of the estimator 
(fTTI) to outliers. 

In Section O we prove that the eigenvector method is asymptotically nearly op- 
timal in the sense that it achieves the information theoretic Shannon bound up to a 
multiplicative factor that depends only on the discretization error of the measurements 
27r/L, but not on m and n. In other words, no method whatsoever can accurately es- 
timate the angles if the proportion of good measurements is o(y^). The connection 
between the angular synchronization problem and Max-2-Lin mod L [3] is explored 
in Section [6l Finally, Section [7] is a summary and discussion of further applications 
of the eigenvector method to other synchronization problems over different groups. 

2. The Eigenvector Method. Our approach to finding the self consistent so- 
lution for 01, . . . ,6n starts with forming the following n x n matrix H 

^--1 {^,J}^E ' (2.1) 



where t = V— T. Since 



5ij = —Sij, for all — 1, . . . ,n, (2-2) 



it follows that Hij ~ Hji, where for any complex number z ~ a + ib wc denote by 
z = a — lb its complex conjugate. In other words, the matrix H is Hermitian, i.e., 
H* = H. 

Next, we consider the maximization problem 

n 

max y e-*^'i/,,e*^% (2.3) 

and explain the philosophy behind it. For the correct set of angles ^i, . . . ,6n, each 
good edge contributes 

to the sum in (|2.3p . The total contribution of the good edges is just the sum of ones, 
piling up to be exactly the total number of good edges mgood- On the other hand, 
the contribution of each bad edge will be uniformly distributed on the unit circle in 
the complex plane. Adding up the terms due to bad edges can be thought of as a 
discrete planar random walk where each bad edge corresponds to a unit size step at a 
uniformly random direction. These random steps mostly cancel out each other, such 
that the total contribution of the mbad edges is only 0{^mbad)- It follows that the 
objective function in (|2.3p has the desired property of diminishing the contribution of 
the bad edges by a square root relative to the linear contribution of the good edges. 

Still, the maximization problem (|2.3p is a non-convex maximization problem 
which is quite difficult to solve in practice. We therefore introduce the following 
relaxation of the problem 

n 

max z*HijZj. (2.4) 

Zi, . . . , z„ G C ij=i 



That is, we replace the previous n individual constraints for each of the variables 
Zi = e'®' to have a unit magnitude, by a single and much weaker constraint, requiring 
the sum of squared magnitudes to be n. The maximization problem (j2.4[) is that of a 
quadratic form whose solution is simply given by the top eigenvector of the Hermitian 
matrix H. Indeed, the spectral theorem implies that the eigenvectors vi,V2, ■ ■ ■ ,Vn of 
H form an orthonormal basis for C" with corresponding real eigenvalues Ai > A2 > 
. . . > A„ satisfying Hvi — XiVi. Rewriting the constrained maximization problem 
(1^ as 

max z*Hz, (2.5) 

it becomes clear that the maximizer z is given by 2; = where Vi is the normalized 
top eigenvector satisfying Hvi = XiVi and HiiilP = n, with Ai being the largest eigen- 
value. The components of the eigenvector vi are not necessarily of unit magnitude, 
so we normalize them and define the estimated angles by 

e''' = ^, for^-l,...,n (2.6) 

bi(*)l 

(see also equation (|1.7p ). 

The top eigenvector can be efficiently computed by the power iteration method 
that starts from a randomly chosen vector bo and iterates 6,1+1 = \\Hb"\\ • Each 
iteration requires just a matrix-vector multiplication that takes 0{n^) operations for 
dense matrices, but only 0{m) operations for sparse matrices, where m = \E\ is the 
number of non-zero entries of H corresponding to edges in the graph. The number of 
iterations required by the power method decreases with the spectral gap that indeed 
exists and is analyzed in detail in Section |4l 

Note that cycles in the graph of good edges lead to consistency relations between 
the offset measurements. For example, if the three edges {i, j}, {j, /c}, {fc, «} are a 
triangle of good edges, then the corresponding offset angles Sij, Sjk and Ski must 
satisfy 



because 



S^J + Sjk + = mod 2tt, (2.7) 



S^J + Sjk + Skt = 61, - 0j + e.j - Ok + ek-e, = Q mod 2n. 



A closer look into the power iteration method reveals that multiplying the matrix H 
by itself integrates the information in the consistency relation of triplets, while higher 
order iterations exploit consistency relations of longer cycles. Indeed, 

n 

fc=l k:{i,k},{j.k}eE k:{i,k}.{j,k}eE 

= # {fc : {i, fc} and {j, fc} G Egood} e<'^-''^ (2.9) 



E 



e 

k:{i,k} or {j,fc}eBbad 



where we employed (|2.2p in (|2.8p . and (|2.7p in (|2.9p . The top eigenvector therefore 
integrates the consistency relations of all cycles. 
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3. The semidefinite program approach. A different natural relaxation of 
the optimization problem (|2.3p is using SDP. Indeed, the objective function in (|2.3p 
can be written as 

n 

e-^^'iyye'"^ = trace(i?e), (3.1) 

ij = l 

where Q is the n x n complex- valued rank-one Hermitian matrix 

e,y ^e'C'--^^'. (3.2) 

Note that 8 has ones on its diagonal 

e,, = 1, z = l,2,...,n. (3.3) 

Except for the non-convex rank-one constraint implied by (|3.2p . all other constraints 
are convex and lead to the natural SDP relaxation (|1.8p - p.l0|) . This program is almost 
identical to the Goemans- Williamson SDP for finding the maximum cut in a weighted 
graph. The only difference is that here we maximize over all possible complex-valued 
Hermitian matrices, not just the symmetric real matrices. The SDP-based estimator 
corresponding to (jl.8[) - (|1.10p is then obtained from the best rank-one approximation 
of the optimal matrix using the Cholesky decomposition. 

The SDP method may seem favorable to the eigenvector method as it explicitly 
imposes the unit magnitude constraint for e*^' . Our numerical experiments show that 
the two methods give similar results (see Table 14. 3p . Since the eigenvector method is 
much faster, it is also the method of choice for large scale problems. 

4. Connections with random matrix theory and spectral graph theory. 

In this section we analyze the eigenvector method using tools from random matrix 
theory and spectral graph theory. 

4.1. Analysis of the complete graph angular synchronization problem. 

We first consider the angular synchronization problem in which all (2) angle offsets 
are given, so that the corresponding graph is the complete graph Kn of n vertices. 
We also assume that the probability for each edge to be good is p, independently of 
all other edges. This probabilistic model for the graph of good edges is known as the 
Erdos-Renyi random graph G{n,p) [9]. We refer to this model as the complete graph 
angular synchronization model. 

The elements of H in the complete graph angular synchronization model are 
random variables given by the following mixture model. With probability p the edge 
{i,j} is good and Hij = e^^^~^^\ whereas with probability 1 — p the edge is bad and 
Hij ~ Uniform {S^). It is convenient to define the diagonal elements as Ha = p. 

The matrix H is Hermitian and the expected value of its elements is 

EH,j =pe^^'-'''\ (4.1) 
In other words, the expected value of H is the rank-one matrix 

EH^npzz*, (4.2) 
where z is the normalized vector (||z|| ~ 1) given by 

z, = ^e''^% i = l,...,n. (4.3) 



The matrix H can be decomposed as 



H = npzz* + i?, (4.4) 

where R = H — KH is a random matrix whose elements have zero mean, with Ra ~ 0, 
and for i ^ j 

j (1 — p)e'(^'"^3) with probabihty p , , 



e 



pgi(oi dj) .^ p^ 1 —p and ip ^ Uniform{[0,2'ir)) 



The variance of Rij is 

E|i?,,f = (1 - pfp + (1 + p2)(i _ p) = 1 _ p2 (4 

for i j and for the diagonal elements. Note that for p = 1 the variance vanishes 
as all edges become good. 

The distribution of the eigenvalues of the random matrix R follows Wigner's 
semi-circle law [40l [41] whose support is [~2y/n{l — p'^),2y/n{l — p"^)]. The largest 
eigenvalue of R, denoted Xi{R), is concentrated near the right edge of the support 
[2] and the universality of the edge of the spectrum [34] implies that it follows the 
Tracy- Widom distribution [36j even when the entries of R are non-Gaussian. For our 
purposes, the approximation 



Ai(i?)«2v/n(l-p2) (4.7) 

will suffice, with the probabilistic error bound given in [5]. 

The matrix H = npzz* + R can be considered as a rank-one perturbation to a 
random matrix. The distribution of the largest eigenvalue of such perturbed random 
matrices was investigated in |29| llll I15j for the particular case where z is propor- 
tional to the all-ones vector (1 1 • • • 1)^. Although our vector z given by (|4.3p is 
different, without loss of generality, we can assume di ^ 02 = ■ ■ ■ = 0„ = 0, since this 
assumption does not change the statistical properties of the random matrix R. Thus, 
adopting [11] Theorem 1.1] to H gives that for 



np 



> V"(l-p2) (4.8) 



the largest eigenvalue Xi{H) jumps outside the support of the semi-circle law and is 
normally distributed with mean /i and variance cr^ given by 



MiH)^m,y), p^^^ + ^^, .^^il^±l)^(l-,^), (4.9) 

V 1 — p"^ P i^P 



whereas for np < 1/71(1 — p'^), Xi{H) still tends to the right edge of the semicircle 
given at 2 1/71(1 — p'^). 

Note that the factor of 2 that appears in (|4.7p has disappeared from (|4.8|) , which 
is perhaps somewhat non-intuitive: it is expected that Xi{H) > Xi{R) whenever np > 
Ai(i?), but the theorem guarantees that Xi{H) > Xi{R) already for np > ^Xi{R). 

The condition (|4.8p also implies a lower bound on the correlation between the 
normalized top eigenvector vi of H and the vector z. To that end, consider the 
eigenvector equation satisfied by vi: 



Xi{H)vi = Hvi = {npzz* + R)vi. 
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(4.10) 



Taking the dot product with vi yields 



Xi{H)=np\{z,vi)\^ + vlRvi. (4.11) 
From v^Rvi < Xi{R) we obtain the lower bound 

M^)-M^^ (4.12) 
np 

with Xi{H) and Ai(i?) given by (|T7)) and ((^9)) . Thus, if the spectral gap Ai(i7)-Ai(i?) 
is large enough then vi must be close to z, in which case the eigenvector method 
successfully recovers the unknown angles. Since the variance of the correlation of two 
random unit vectors in M" is 1/n, the eigenvector method would give above random 
correlation values whenever 

him^M^ > 1. (4.13) 

np n 

Replacing in (|4.13p Xi{H) by ^ from (|4.9p and Xi{R) by (|4.7|) and multiplying by 
P\/n yields the condition 



-ft + ^^-2v/T^>^. (4.14) 
Since -^^= + ^^^^p^ > 2, it follows that gTll) is satisfied for 

p>^. (4.15) 

Thus, already for p > we should obtain above random correlations between the 
vector of angles z and the top eigenvector vi. We therefore define the threshold 
probability Pc as 

Pc = (4.16) 

When np ^ Ai(-R); the correlation between vi and z can be predicted by using 
regular perturbation theory for solving the eigenvector equation (|4.10p in an asymp- 
totic expansion with the small parameter e = ^^^^^ . Such perturbations are derived in 
standard textbooks on quantum mechanics aiming to find approximations to the en- 
ergy levels and eigenstates of perturbed time-independent Hamiltonians (see, e.g., [20|, 
Chapter 6]). In our case, the resulting asymptotic expansions of the non- normalized 
eigenvector vi and of the eigenvalue Xi{H) are given by 

vir^ z + — [Rz- {z*Rz)z] + ..., (4.17) 
np 

and 



Xi{H) np + z*Rz + 
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(4.18) 



Note that the first order term in (I4.17P is perpendicular to the leading order term z, 
from which it follows that the angle a between the eigenvector vi and the vector of 
true angles z satisfies the asymptotic relation 



tan^ OL 



\Rz\ 



{z*Rzf 



(4.19) 



because — {z*Rz)z\^ = 
terms in (j4.19p are given by 



— {z*RzY . The expected values of the numerator 



E||i?zf = 



^RijZj 



^ Var(i?,,z,) = |z,f (1 V) - (n-l)(l V), 

i,j = l 1=1 j^i 

(4.20) 



and 



K{z*Rzf 



E 



RijZiZj 



J2 Var(i?,,z,z,) = (l-p2)^ 

2 „ 



z,f\zA^ 



(4.21) 



where we used that Rij are i.i.d zero mean random variables with variance given by 
(gH) and that \z,\'^ = i. Substituting (g^Ol)- (11211) into (|iT^ results in 



E tan^ a 



which for p <C 1 and n 3> 1 reads 



(n- 1)2(1 -p2) 



(4.22) 



Etan a 



(4.23) 



This expression shows that as np^ goes to infinity, the angle between vi and z goes 
to zero and the correlation between them goes to 1. For np^ ^ 1, the leading order 
term in the expected squared correlation E cos^ a is given by 



Ecos^a = E- 



1 



1 



1 + tan a 



(4.24) 



1 



np'^ 



We conclude that even for very small p values, the eigenvector method successfully 
recovers the angles if there are enough equations, that is, if np^ is large enough. 

Figure H?T] shows the distribution of the eigenvalues of the matrix H for n = 400 
and different values of p. The spectral gap decreases as p is getting smaller. From 
we expect a spectral gap for p > Pc where the critical value is pc = ~ 0.05. 



The experimental values of Ai {H) also agree with (|4.9|) . For example, for n ~ 400 and 
p = 0.15, the expected value of the largest eigenvalue is = 67.28 and its standard 
deviation is cr = 0.93, while for p = 0.1 we get fi = 50.15 and a = 0.86; these value are 



in full agreement with the location of the largest eigenvalues in Figures 4.1(a)pri(b) 
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Fig. 4.1. Histogram of the eigenvalues of the matrix H in the complete graph model for n = 400 
different values of p. 



Note that the right edge of the semi-circle is smaller than 2\/n = 40, so the spectral 
gap is significant even when p = Q.l. 

The skeptical reader may wonder whether the existence of a visible spectral gap 
necessarily implies that the normalized top eigenvector vi correctly recovers the orig- 
inal set of angles 0i, . . . , 0„ (up to a constant phase). To that end, we compute the 
following two measures of correlation pi and p2 for the correlation between the vector 
of true angles z and the computed normalized top eigenvector vi: 



Pi 



n 

-E' 

n ^ 

i=l 



Vl{i) 



P2 



1 



i=l 



'Wl(j) 



(4.25) 



The correlation pi takes into account the rounding procedure (|2.6|) , while p2 is simply 
the dot product between vi and z without applying any rounding. Clearly, pi, P2 < 
1 (Cauchy-Schwartz), and pi = 1 iff the two sets of angles are the same up to a 
rotation. Note that it is possible to have pi = 1 with p2 < I- This happens when 
the angles implied by vi{i) are all correct, but the magnitudes |wi(j)| are not all 
the same. Table l4Tl summarizes the experimentally obtained correlations pi, P2 for 
different values of p with n = 100 (Table [40(a)| ) and n = 400 (Table [40(fe)| ). The 
experimental results show that for large values of np^ the correlation is very close to 1, 
indicating a successful recovery of the angles. The third column, indicating the values 

of ^1 + is motivated by the asymptotic expansion ()4.24p and seems to provide 

a very good approximation for p2 when np^ 3> 1, with deviations attributed to higher 
order terms of the asymptotic expansion and to statistical fluctuations around the 



mean value. Below the threshold probability (ending rows of Tables 4.0(a) and 4.0(b) 
with np^ < 1), the correlations take values near as expected from the correlation 

of two unit random vectors in M" — = 0.1 and — = 0.05). 

^ \/Too v^oo ' 

From the practical point of view, most important is the fact that the eigenvector 
method successfully recovers the angles even when a large portion of the offset mea- 
surements consists of just outliers. For example, for n ~ 400, the correlation obtained 
when 85% of the offset measurements were outliers (only 15% are good measurements) 
was pi = 0.97. 

4.2. Analysis of the angular synchronization problem in general. We 

turn to analyze the eigenvector method for general measurement graphs, where the 
graph of good measurements is assumed to be connected, while the graph of bad edges 
is assumed to be made of edges that are uniformly drawn from the remaining edges 
of the complete graph once the good edges has been removed from it. Our analysis is 
based on generalizing the decomposition given in (|4.4p . 
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(a) n = 100 



(b) n = 400 
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Pi 


P2 


0.4 


16 


0.97 


0.99 


0.98 


0.3 


9 


0.95 


0.97 


0.95 


0.2 


4 


0.89 


0.90 


0.88 


0.15 


2.25 


0.83 


0.75 


0.81 


0.1 


1 


0.71 


0.34 


0.35 


0.05 


0.25 


0.45 


0.13 


0.12 



P 




V "p- / 


Pi 


P2 


0.2 


16 


0.97 


0.99 


0.97 


0.15 


9 


0.95 


0.97 


0.95 


0.1 


4 


0.89 


0.90 


0.87 


0.075 


2.25 


0.83 


0.77 


0.76 


0.05 


1 


0.71 


0.28 


0.32 


0.025 


0.25 


0.45 


0.06 


0.07 



Table 4.1 

Correlations between the top eigenvector vi of H and the vector z of true angles for different 
values of p in the complete graph model. 



Let A be the adjacency matrix for the set of good edges Egood'- 

A,, = ( \ ^^■■^\%%'°°'' ■ (4.26) 

As the matrix A is symmetric, it has a complete set of real eigenvalues Ai > A2 > 
. . . > A„ and corresponding real orthonormal eigenvectors ijji, . . . ,ipn such that 



Y,>^iMT- (4.27) 



1=1 

Let Z be an n X n diagonal matrix whose diagonal elements are Za = e*^' . Clearly, 
Z is a unitary matrix (ZZ* = I). Define the Hermitian matrix B by conjugating A 
with Z 

B = ZAZ*. (4.28) 

It follows that the eigenvalues of B are equal to the eigenvalues Ai, . . . , A„ of A, and 
the corresponding eigenvectors {4>i}"^i of B, satisfying B(j)i ~ arc given by 

0i=ZV'i, ? = l,...,n. (4.29) 

From (|4?28)) it follows that 

B -[ "''"'"'^^ Kj}e£;,oo. .430. 

'■'"I {^,J]^Eg,,d ■ ^ 
We are now ready to decompose the matrix H defined in (|2.ip as 

H^B + R, (4.31) 
where i? is a random matrix whose elements are given by 

% = ( C f'i'^f^' . (4-32) 

where Sij ~ Uniform{[0, 27r) for {i,j} € Ehad- The decomposition (|4.3ip is extremely 
useful, because it sheds light into the eigen-structurc of H in terms of the much simpler 
eigen-structurcs of B and R. 
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First, consider the matrix B defined in (j4.28p . which shares the same spectrum 
with A and whose eigenvectors 0i , . . . , 0„ are phase modulations of the eigenvectors 
Tpi, . . . ,ijjn of A. If the graph of good measurements is connected, as it must be in 
order to have a unique solution for the angular synchronization problem (see second 
paragraph of Section [T|), then the Perron-Frobenius theorem (see, e.g., [22l Chapter 
8]) for the non- negative matrix A implies that the entries of are all positive 

ipi{i)>0, for alH = 1,2,.. .,n, (4.33) 

and therefore the complex phases of the coordinates of the top eigenvector (pi = Ztpi of 
B are identical to the true angles, that is, e*^' = . Hence, if the top eigenvector of 

H is highly correlated with the top eigenvector of B, then the angles will be estimated 
with high accuracy. We will shortly derive the precise condition that guarantees such 
a high correlation between the eigenvectors of H and B. 

The spectral gap Agood of the good graph is the difference between its first and 
second eigenvalues, i.e., Agood ~ ^li^) ~ X2{A). The Perron-Frobenius theorem and 
the connectivity of the graph of good measurements also imply that Agood > 0. 

Next we turn to analyze the spectrum of the random matrix R given in (|4.32p . We 
assume that the nibad bad edges were drawn uniformly at random from the remaining 
edges of the complete graph on n vertices that are not already good edges. There are 
only 2mbad nonzero elements in i?, which makes R a sparse matrix with an average 
number of 2m.bad/n nonzero entries per row. The nonzero entries of R have zero mean 
and unit variance. The spectral norm of such sparse random matrices was studied in 
[25l[24j where it was shown that with probability 1, 

limsup— ^=Ai(i?) < 2 

n-*oo \J2m\yad 

as long as ^^"''^ — > oo as n — + oo. The implication of this result is that we can 
approximate Ai(i?) with 



Ai(i?)«2^^?^. (4.34) 



Similar to the spectral gap condition (|4.8[) . requiring 



Agood > \\i{R). (4.35) 

ensures that with high probability, the top eigenvector of B would be highly correlated 
with the top eigenvector of B. Plugging ()4.34|) into ()4.35|) . we get the condition 



Agood > ^— ■ (4.36) 

We illustrate the above analysis for the small world graph, starting with a neigh- 
borhood graph on the unit sphere with n vertices corresponding to points on the 
sphere and m edges, and rewiring each edge with probability 1—p at random, resulting 
shortcut edges. The shortcut edges are considered as bad edges, while unperturbed 
edges are the good edges. As the original m edges of the small world graph are rewired 
with probability the expected number of bad edges Kmbad and the expected 

number of good edges Kmgood are given by 



E^good = pm, Embad = (1 - p)^, 
12 



(4.37) 



with relatively small fluctuations of 0(-\/mp(l — p)). 

The average degree of the original unperturbed graph is d = Assuming 
uniform sampling of points on the sphere, it follows that the average area of the 
spherical cap covered by the neighboring points is 47r-^ = The average opening 

angle rj corresponding to this cap satisfies 27r(l — cosr/) = or 1 — cost; = 

Consider the limit m, n — > oo while keeping the ratio c = 4m/n^ constant. By the law 
of large numbers, the matrix converges in this limit to the integral convolution 
operator JC on (see, e.g., given by 



= Lx[i-c,i]((/3,/3'»./(/3')rfV> (4.38) 



where xi is the characteristic function of the interval /. 

The classical Funk-Hecke theorem (see, e.g., [5S1 p. 195]) asserts that the spherical 
harmonics are the eigenfunctions of convolution operators over the sphere, and the 
eigenvalues A; are given by 



MIC) = 



and have multiplicities 21 + 1 (i = 0, 1, 2, . . .), where Pi are the Legendre polynomials 
{Po{t) = l,Pi(t) = t,...). In particular, Ao(/C) = ^, Ai(/C) = f (1 - ^c), and the 

2 

spectral gap of JC is A(/C) = The spectral gap of A is approximately 

4 n'^ 

Plugging and into yields the condition 

which is satisfied for p > pc, where pc is the threshold probability 



We note that this estimate for the threshold probability is far from being tight and 
can be improved in principle by taking into account the entire spectrum of the good 
graph rather than just the spectral gap between the top eigenvalues, but we do not 
attempt to derive tighter bounds here. 

We end this section by describing the results of a few numerical experiments. 
Figure 14.21 shows the histogram of the eigenvalues of the matrix H for small-world 
graphs on 5^ . Each graph was generated by sampling n points /3i , . . . , /3„ on the 
unit sphere in from the uniform distribution as well as n random rotation 
angles 0i, . . . ,6n uniformly distributed in [0, 27r). An edge between i and j exists iff 
{Pi, Pj) > 1 — £, where e is a small parameter that determines the connectivity (average 
degree) of the graph. The resulting graph is a neighborhood graph on S^. The small 
world graphs were obtained by randomly rewiring the edges of the neighborhood 
graph. Every edge is rewired with probability 1 — p, so that the expected proportion 
of good edges is p. 
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The histograms of Figure 14.21 for the eigenvalues of H seem to be much more 
exotic than the ones obtained in the complete graph case shown in Figure 14.11 In 
particular, there seems to be a long tail of large eigenvalues, rather than a single 
eigenvalue that stands out from all the others. But now we understand that these 
eigenvalues are nothing but the top eigenvalues of the adjacency matrix of the good 
graph, related to the spherical harmonics. This behavior is better visible in Figure 
431 



irn {EJ '{SJ 

10 20 40 60 -20 20 40 -20 -10 10 20 3 



(a) p = 1 



(b) p = 0.7 



(c) p = 0.4 




-10 10 20 



(d) p = 0.1 



Fig. 4.2. Histogram of the eigenvalues of the matrix H in the small-world model for n = 400, 
£ = 0.2, m Si 8000, and different values of p. 




(a)p=l (b)p = 0.2 (c)p = 0.1 (d)p = 0.05 



Fig. 4.3. Bar plot of the 25 largest eigenvalues of the matrix H in the small-world model for 
n = 4000, e = 0.2, m ^ 8 - 10^, and different values of p. The multiplicities 1, 3, 5, 7, 9 corresponding 
to the spherical harmonics are evident as long as p is not too small. As p decreases the high- 
oscillatory spherical harmonics are getting "swallowed" by the semi-circle. 



The experimental correlations given in Table [4?2l indicate jumps in the correlation 
values that occur between p ~ 0.15 and p = 0.2 for n — 100 and between p = 0.1 
and p = 0.12 for n = 400. The experimental threshold values seem to follow the law 
Pc ~ that holds for the complete graph case ()4.15p with m = (2) . As mentioned 

earlier. (|4.4ip is a rather pessimistic estimate of the threshold probability. 

Also evident from Table l4?2l is that the correlation goes to 1 as 2mp^ /n oo. We 
remark that using regular perturbation theory and the relation of the eigenstructure 
of B to the spherical harmonics, it should be possible to obtain an asymptotic series 
for the correlation in terms of the large parameter 2mp^/n, similar to the asymptotic 
expansion (|4.24p . 

The comparison between the eigenvector and SDP methods (as well as the least 
squares method of Section [T]) is summarized in Table 14.31 showing the numerical cor- 
relations for n = 200, e = 0.3 (number of edges m k, 3000) and for different values of 
p. Although the SDP is slightly more accurate, the eigenvector method runs faster. 

5. Information Theoretic Analysis. The optimal solution to the angular syn- 
chronization problem can be considered as the set of angles that maximizes the log- 
likelihood. Unfortunately, the log-likelihood is a non-convex function and the maxi- 
mum likelihood cannot be found in a polynomial time. Both the eigenvector method 
and the SDP method are polynomial-time relaxations of the maximum log-likelihood 
problem. In the previous section we showed that the eigenvector method fails to re- 
cover the true angles when p is below the threshold probability p"^. It is clear that 
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(a) n = 100, e = 0.3, m Ri (b) n = 400, e = 0.2, m » 

750 8000 



p 


2mp^ 
n 


Pi 




P 


2mp-' 
n 


Pi 


0.8 


9.6 


0.923 




0.8 


26 


0.960 


0.6 


5.4 


0.775 




0.4 


6.4 


0.817 


0.4 


2.4 


0.563 




0.3 


3.6 


0.643 


0.3 


1.4 


0.314 




0.2 


1.6 


0.282 


0.2 


0.6 


0.095 




0.1 


0.4 


0.145 



Table 4.2 

Correlations between the top eigenvector of H and the vector of true angles for different values 
of p in the small-world model. 



p 


Plsqr 


Peig 


Psdp 


rankO 


1 


1 


1 


1 


1 


0.7 


0.787 


0.977 


0.986 


1 


0.4 


0.046 


0.839 


0.893 


3 


0.3 


0.103 


0.560 


0.767 


3 


0.2 


0.227 


0.314 


0.308 


4 


0.15 


0.091 


0.114 


0.102 


5 



Table 4.3 



Comparison between the correlations obtained by the eigenvector method Peig, by the SDP 
method psdp o.'f^d, by the least squares method pi^qr for different values of p (small world graph on 
, n = 200, e = 0.3, m ?s 3000j. The SDP tends to find low-rank matrices despite the fact that the 
rank-one constraint on is not included in the SDP. The rightmost column gives the rank of the 
matrices that were found by the SDP. To solve the SDP f iOI) we used SDPLR, a package for 

solving large-scale SDP problems {3f. The least squares solution was obtained using MATLAB's Isqr 
function. As expected, the least squares method yields poor correlations compared to the eigenvector 
and the SDP methods. 



even the maximum likelihood solution would fail to recover the correct set of angles 
below some (perhaps lower) threshold. It is therefore natural to ask if the threshold 
value of the polynomial eigenvector method gets close to the optimal threshold value 
of the exponential-time maximum likelihood exhaustive search. In this section we pro- 
vide a positive answer to this question using the information theoretic Shannon bound 
[S] . Specifically, we show that the threshold probability for the eigenvector method is 
asymptotically larger by just a multiplicative factor compared to the threshold prob- 
ability of the optimal recovery algorithm. The multiplicative factor is a function of 
the angular discretization resolution, but not a function of n and m. The eigenvector 
method becomes less optimal as the discretization resolution improves. 

We start the analysis by recalling that from the information theoretic point of 
view, the uncertainty in the values of the angles is measured by their entropy. The 
noisy offset measurements carry some bits of information on the angle values, therefore 
decreasing their uncertainty, which is measured by the conditional entropy that we 
need to estimate. 

The angles ^i, . . . , 0„ can take any real value in the interval [0, 27r). However, an 
infinite number of bits is required to describe real numbers, and so we cannot hope to 
determine the angles with an arbitrary precision. Moreover, the offset measurements 
are often also discretizcd. We therefore seek to determine the angles only up to some 
discretization precision where L is the number of subintervals of [0, 2n) obtained 
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by dividing the unit circle is into L equally sized pieces. 

Before observing any of the offset measurements, the angles are uniformly dis- 
tributed on {0, 1, . . . , L — 1}, that is, each of them falls with equal probability 1/L to 
any of the L subintervals. It follows that the entropy of the i'th angle 9i is given by 

L-l ^ ^ 

H{e,) = -Y,l^og^l=^og^L, for^ = l,2,...,n. (5.1) 

1=0 

We denote hy 9" — {9i, ... ,6n) the vector of angles. Since 6i, . . . ,9n are independent 
(the orientations of the molecules arc random), their joint entropy H{9"') is given by 

n 

H{d^) = Y,H{9,)^n\og,L, (5.2) 
1=1 

reflecting the fact that the configuration space is of size L" — 2"'°S2 ^. 

Let Sij be the random variable for the outcome of the noisy offset measure- 
ment of 6i and 9j. The random variable Sij is also discretized and takes values in 
{0, 1, . . . , L — 1}. We denote by (5™ = {Siiji,- ■ ■ the vector of all offset mea- 

surements. Conditioned on the values of 6i and dj, the random variable Sij has the 
following conditional probability distribution 

because with probability 1 — p the measurement Sij is an outlier that takes each 
of the L possibilities with equal probability and with probability p it is a good 
measurement that equals di — 9j. It follows that the conditional entropy H{Sij\9i, 9j) 
is 

H{S,,\9,, 9,) ^-{L- l)i^ log, ^-(p+ ^) log2 (p + ^) . (5.4) 
We denote this entropy by H{L,p) and its deviation from log, L by I{L,p), that is, 
HiL,p) ^-{L- 1)^ log, ^ - (p + ^) log2 {p + ^) . (5.5) 

and 

I{L,p) = \og^L~H{L,p). (5.6) 

Without conditioning, the random variable 5ij is uniformly distributed on {0, . . . , L — 
1} and has entropy 

H{5,,) ^ log., L. (5.7) 

It follows that the mutual information I{Sij;9i,9j) between the offset measurement 
Sij and the angle values 9i and 9j is 

/(<5„; a„ 9,) = H{S,j) - H{S,j\9,, 6,) = log, L ~ H{L,p) = 1(1, p). (5.8) 
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This mutual information measures the reduction in the uncertainty of the random 
variable 6ij from knowledge of 6i and 9j. Due to the symmetry of the mutual infor- 
mation, 

/(<5„;0„%) = - H{5,,\e,,0j) = H{9,,e,) - (5.9) 

the mutual information is also the reduction in uncertainty of the angles 6i and 6j 
given the noisy measurement of their offset 6ij. Thus, 

H{e.„ ej\6,j) = Hi9,,e,) - /(<5,;,; 0., Oj). (5.10) 

Similarly, given all m offset measurements (5™, the uncertainty in 0" is given by 

F(0"|^™) = iJ(0") - I{S"'; 0"), (5.11) 

with 

/(^"; 6/") = if(^") - F(J™|6»"). (5.12) 

A simple upper bound for this mutual information is obtained by explicit evaluation of 
the conditional entropy iJ((5"'|0") combined with a simple upper bound on the joint 
entropy term H{S"^). First, note that given the values of 0i, . . . , 6'„, the offsets become 
independent random variables. That is, knowledge of Si-^j-^ (given 0i-^ , Oj-^ ) does not give 
any new information on the value of (5^2 (given 6i^^ ^ja)- The conditional probability 
distribution of the offsets is completely determined by (|5.3p . and the conditional 
entropy is therefore the sum of m identical entropies of the form (j5.4p 

if(J"|0") =mi/(L,p). (5.13) 

Next, bounding the joint entropy H{d"^) by the logarithm of its configuration space 
size L™ yields 

i?(^'") < TOl0g2i. (5.14) 

Note that this simple upper bound ignores the dependencies among the offsets which 
we know to exist, as implied, for example, by the triplet consistency relation (j2.7p . As 
such, (|5.14p is certainly not a tight bound, but still good enough to prove our claim 
about the nearly optimal performance of the eigenvector method. 

Plugging (|5.13p and (|5.14[) in (|5.12[) yields the desired upper bound on the mutual 
information 

/((5™; 6>") < mlog2 L - mH{L,p) = mI{L,p). (5.15) 

Now, substituting the bound (|5.15p and the equality (|5.2|) in (|5.11|) gives a lower 
bound for the conditional entropy 

H(6»"|6") > nlog2i-m/(L,p). (5.16) 

We may interpret this bound in the following way. Before seeing any offset measure- 
ment the entropy of the angles is n\og2 L, and each of the m offset measurements can 
decrease the conditional entropy by at most I{L,p), the information that it carries. 

The bound (|5.16p demonstrates, for example, that for fixed n, p and L, the 
conditional entropy is bounded from below by a linear decreasing function of m. It 
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follows that unless m is large enough, the uncertainty in the angles would be too large. 
Information theory says that a successful recovery of all , . . . , 0^ is possible only 
when their uncertainty, as expressed by the conditional entropy, is small enough. The 
last statement can be made precise by Fano's incciuality and Wolfowitz' converse, also 
known as the weak and strong converse theorems to the coding theorem that provide 
a lower bound for the probability of the error probability in terms of the conditional 
entropy, see, e.g., [3 Chapter 8.9, pages 204-207] and [HI Chapter 5.8, pages 173-176]. 

In the language of coding, we may think of 0" as a codeword that we are trying 
to decode from the noisy vector of offsets 5^ which is probabilistically related to 0". 
The codeword 0" is originally uniformly distributed on {1,2,..., 2"''°S2 ^} and from 
5™ we estimate 0" as one of the 2"'°S2^ possibilities. Let the estimate be 6 and 
define the probability of error as Pe = Vi{6 ^ 0"}. Fano's inequality [3 Lemma 
8.9.1, page 205] gives the following lower bound on the error probability 

iJ(0"|<5'") < l + Pen\og^L. (5.17) 

Combining (j5.17|) with the lower bound for the conditional entropy ()5.16p we obtain 
a weak lower bound on the error probability 

mI{L,p) 1 

Pe>l--^^'-, T. (5.18) 

n logjL niog^L 

This lower bound for the probability of error is applicable to all decoding algorithms, 
not just for the eigenvector method. For large n, we see that for any /? < 1, 

!^{^</3=^P,>l_/3 + o(l). (5.19) 

We are mainly interested in the limit m,n —f oo and p — s- with L being fixed. The 
Taylor expansion of I{L,p) (given by (|5.5p - (|5.6p ') near p — reads 

I{L,p)^^{L-l)p^ + 0{p'). (5.20) 
Combining (|5.19p and (|5.20p we obtain that 



n 2 log2 L 



m{L-l) 



f3 =^ Pe>l- f3 + o{l), as n,m ^ oo, n/m ^ 0. (5.21) 



Note that n/m 0, because m > nlogn in order to ensure with high probability the 
connectivity of the measurement graph G. The bound (|5.2ip was derived using the 
weak converse theorem (Fano's inequality). It is also possible to show that the prob- 
ability of error goes exponentially to 1 (using the Wolfowitz' converse and Chcrnoff 
bound, see [HI Theorem 5.8.5, pages 173-176]). 

The above discussion shows that there does not exist a decoding algorithm with 
a small probability for the error for values of p below the threshold probability p™^ 
given by 



V m L — 1 

Note that for L = 2, the threshold probability p^*^ = ^ of the eigenvector method 
in the complete graph case for which m ~ (2) is 2 times smaller than p"''^ . This is 
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not a violation of information theory, because the fact that the top eigenvector has 
a non-trivial correlation with the vector of true angles does not mean that all angles 
are recovered correctly by the eigenvector. The fact that the eigenvector method 
gives non-trivial correlations below the information theoretic bound is just another 
evidence to its effectiveness. 

We turn to shed some light on why it is possible to partially recover the angles 
below the information theoretic bound. The main issue here is that it is perhaps too 
harsh to measure the success of the decoding algorithm by = Pr{^ ^ 6"}. For 
example, when the decoding algorithm decodes 999 angles out of n = 1000 correctly 
while making just a single mistake, we still count it as a failure. It may be more 
natural to consider the probability of error in the estimation of the individual angles. 
We proceed to show that this measure of error leads to a threshold probability which 
is smaller than (|5.22p by just a constant factor. 

Let P]^' = Pr{^i ^61} be the probabihty of error in the estimation of 9i . Again, 
we want to use Fano's inequality to bound the probability of the error by bounding 
the conditional entropy. A simple lower bound to the conditional entropy H{9i\S"^) 
is obtained by conditioning on the remaining n — 1 angles 

> 02,9,, . . . , 0„). (5.23) 

Suppose that there are di noisy offset measurements of the form 9i — 9j, that is, di 
is the degree of node 1 in the measurement graph G. Let the neighbors of node 1 
be ji, j2, . . • ,jdi with corresponding offset measurements Sij-^, . . . , Sij^_^ . Given the 
values of all other angles 62, ■ ■ ■ ,9n, and in particular the values of flj^ , . . . , , these 
di equations become noisy equations for the single variable 9i. We denote these 
transformed equations for 9i alone by 5i , . . . , S^^ . All other m — di equations do not 
involve 61 and therefore do not carry any information on its value. It follows that 

H{9i\5"\ 92,93,...,9n)^ H{9i |d\ , . . . , 5^ J. (5.24) 

We have 

H{5i,...,~5dA9i)=dxH{L,p), (5.25) 

because given 9i these di equations are i.i.d random variables with entropy H(L,p). 
Also, a simple upper bound on the di equations (without conditioning) is given by 

HCSi,...,~Sd,) <di\og2L, (5.26) 

ignoring possible dependencies among the outcomes. From (|5.25p - (|5.26[) we get an 
upper bound for the mutual information between 9i and the transformed equations 

/(0i; Ji, ...,~6d,)<di [log2 L - H{L,p)] = diI{L,p). (5.27) 

Combining ([QS]) . (fSTM]) ([5?27l) and jSH]) we get 

H{ei\8'^) > H{9^\8^\92, 03, ... , On) (5.28) 
= H{9i\5i,...M 
= Hi9i)-I{9,;Si,...,Sd,) 

> logai - diI{L,p). 
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This lower bound on the conditional entropy translates, via Fano's inequality, to a 
lower bound on the probability of error Pj^', and it follows that 

diI{L,p)>\Qg^L (5.29) 

is a necessary condition for having a small Pi^'. Similarly, the condition for a small 
probability of error in decoding 9i is 

d,/(i,p) > log2i, (5.30) 

where di is the degree of vertex i in the measurement graph. This condition suggests 
that we should have more success in decoding angles of high degree. The average 
degree d in a graph with n vertices and m edges is c? = The condition for 

successful decoding of angles with degree d is 

2m 

—I{L,p)>\og^L. (5.31) 
n 

In particular, this would be the condition for all vertices in a regular graph, or in a 
graph whose degree distribution is concentrated near d. 

Substituting the Taylor expansion (|5.20p into (|5.3ip results in the condition 



nlog^L 

P>\—-? 7- (^-^2) 

This means that successful decoding of the individual angles may be possible already 
for p > p^J^''-, where 



pT'=J--^, (5.33) 
\ m L — 1 

but the estimation of the individual angles must contain some error when p < p™''. 
Note that p™"^ < p^J^^, so while for p values between p™*^ and p'^J^f it is impossible to 
successfully decode all angles, it may still be possible to decode some angles. 

In the complete graph case, comparing the threshold probability of the eigenvector 
method p^*^ = ^ given by (|4.15[) and the information theoretic threshold probability 

p™'^ (|5.33[) below which no algorithm can successfully recover individual angles, we 
find that their ratio is asymptotically independent of n and m: 



^ = . r + o(l). (5.34) 

pvnd Y 21og2L ^ ' ^ ' 

Note that the threshold probability p^'^ is smaller than p™'' for L < 6. Thus, we 
may regard the eigenvector method as a very successful recovery algorithm for offset 
equations with a small modulo L. 

For L > 7, equation (|5.34p implies a gap between the threshold probabilities p^'^ 
and p™**, suggesting that the exhaustive exponential search for the maximum like- 
lihood would perform better than the polynomial time eigenvector method. Note, 
however, that the gap would be significant only for very large values of L that corre- 
spond to very fine angular resolutions. For example, even for L = 100 the threshold 



probability of the eigenvector method would only be y Tk^~njo ~ 2.73 times larger 

than that of the maximum likelihood. The exponential complexity of 0{mL") of the 
exhaustive search for the maximum likelihood makes it impractical even for moderate- 
scale problems. On the other hand, the eigenvector method has a polynomial running 
time and it can handle large scale problems with relative ease. 
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6. Connection with Max-2-Lin mod L and Unique Games. The angular 
synchronization problem is related to the combinatorial optimization problem Max-2- 
LiN MOD L for maximizing the number of satisfied linear equations mod L with exactly 
2 variables in each equation, because the discretized offset equations Oi -~ 9j — Sij 
mod L are exactly of this form. Max-2-Lin mod L is a problem mainly studied in 
theoretical computer science, where we prefer using the notation "mod i" instead of 
the more common "MOD p" , to avoid confusion between the size of the modulus and 
the proportion of good measurements. 

Note that a random assignment of the angles would satisfy a fraction of the 
offset equations. Andersson, Engebretsen, and Hastad [3] considered SDP based al- 
gorithms for Max-2-Lin mod L, and showed that they could obtain an (1 + k{L))- 
approximation algorithm, where k{L) > is a constant that depends on L. In par- 
ticular, they gave a very weak proven performance guarantee of -I- 10~^), though 
they concluded that it is most likely that their bounds can be improved significantly. 
Moreover, for L = 3 they numerically find the approximation ratio to be w 0.79, 
and later Goemans and Williamson [12] proved a 0.793733- approximation. The SDP 
based algorithms in [5] are similar in their formulation to the SDP based algorithm 
of Frieze and Jerrum for MAX-k-CuT [14], but with a different rounding procedure. 
In these SDP models, L vectors are assigned to each of the n angle variables, so that 
the total number of vectors is nL. The resulting nL x nL matrix of inner products 
is required to be semidefinitc positive, along with another set of 0{n^L^) linear and 
inequality constraints. Due to the large size of the inner product matrix and the 
large number of constraints, our numerical experiments with these SDP models were 
limited to relatively small size problems (such as n = 20 and L = 7) from which it was 
difficult to get a good understanding of their performance. In the small scale problems 
that we did manage to test, we did not find any supporting evidence that these SDP 
algorithms perform consistently better than the eigenvector method, despite their ex- 
tensive running times and memory requirements. For our SDP experiments we used 
the software SDPT3 [35l|32| and SDPLR [5j in MATLAB. In [3] it is also shown that it is 
NP-hard to approximate Max-2-Lin mod L within a constant ratio, independent of 
L. Thus, we should expect an L-dependent gap similar to (|5.34|1 for any polynomial 
time algorithm, not just for the eigenvector method. 

Max-2-Lin is an instance of what is known as unique games [10], described 
below. One distinguishing feature of the offset equations is that every constraint 
corresponds to a bijection between the values of the associated variables. That is, for 
every possible value of 9i, there is a unique value of 9j that satisfies the constraint 
9i — 9j = 6ij. Unique games are systems of constraints, a generalization of the offset 
equations, that have this uniqueness property, so that every constraint corresponds 
to some permutation. 

As in the setting of offset equations, instances of unique games where all con- 
straints are satisfiable are easy to handle. Given an instance where 1 — £ fraction of 
constraints are satisfiable, the Unique Games Conjecture (UGC) of Khot [26] says 
that it is hard to satisfy even a 7 > fraction of the constraints. The UGC has 
been shown to imply a number of inapproximability results for fundamental problems 
that seem difficult to obtain by more standard complexity assumptions. Note that in 
our angular synchronization problem the fraction of constraints that are satisfiable is 

Charikar, Makarychcv and Makarychev [Sj presented improved approximation 
algorithms for unique games. For instances with domain size L where the optimal 
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solution satisfies 1 — e fraction of all constraints, their algorithms satisfy roughly 
7j-e/(2-e) ajjfj I _ 0{y/e log L) fraction of all constraints. Their algorithms are based 
on SDP, also with an underlying inner products matrix of size nL x nL, but their 
constraints and rounding procedure are different than those of [J • Given the results of 
[27] . the algorithms in [6] are near optimal if the UGC is true, that is, any improvement 
(beyond low order terms) would refute the conjecture. We have not tested their SDP 
based algorithm in practice, because, like the SDP of [3] it is also expected to be 
limited to relatively small scale problems. 

7. Summary and Further Applications. In this paper we presented an eigen- 
vector method and an SDP approach for solving the angular synchronization problem. 
We used random matrix theory to prove that the eigenvector method finds an accurate 
estimate for the angles even in the presence of a large number of outlier measurements. 

The idea of synchronization by eigenvectors can be applied to other problems 
exhibiting a group structure and noisy measurements of ratios of group elements. In 
this paper we specialized the synchronization problem over the group 80(2). In the 
general case we may consider a group G other than SO (2) for which we have good 
and bad measurements gij of ratios between group elements 

9i] = 9t9]~^^ 9i,9] e G. (7.1) 

For example, in the general case, the triplet consistency relation ()2.7|1 simply reads 

9i]9]k9kz = 9i9]~^9j9k^^9k9r^ = e, (7.2) 

where e is the identity element of G. 

Whenever the group G is compact and has a complex or real representation (for 
example, the rotation group SO (3) has a real representation using 3x3 rotation 
matrices), we may construct an Hermitian matrix that is a matrix of matrices: the ij 
element is either the matrix representation of the measurement gij or the zero matrix 
if there is no direct measurement for the ratio of Qi and ■ Once the matrix is formed, 
once can look for its top eigenvectors (or SDP) and estimate the group elements from 
them. 

In some cases the eigenvector and the SDP methods can be applied even when 
there is only partial information for the group ratios. This problem arises naturally 
in the determination of the three-dimensional structure of a macromolecule in cryo- 
electron microscopy [12]. In [32] we show that the common lines between projection 
images give partial information for the group ratios between elements in SO (3) that 
can be estimated accurately using the eigenvector and SDP methods. In [33] we 
explore the close connection between the angular synchronization problem and the 
class averaging problem in cryo-electron microscopy [12j . Other possible applications 
of the synchronization problem over SO (3) include the distance geometry problem in 
NMR spectroscopy [H] dT] and the localization of sensor networks [H |3I] . 

The eigenvector method can also be applied to non-compact groups that can be 
"compactified" . For example, consider the group of real numbers R with addition. 
One may consider the synchronization problem of clocks that measure noisy time 
differences of the form 

ti — tj = tij, ti,tj € M. (7.3) 

We compactify the group R by mapping it to the unit circle t ^ e*"* , where w G R is 
a parameter to be chosen not too small and not too large, as we now explain. There 
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may be two kinds of measurement errors in (j7.3p . The first kind of error is a small 
discretization error (e.g.. a small Gaussian noise) of typical size A. The second type of 
error is a large error that can be regarded as an outlier. For example, in some practical 
application an error of size lOA may be considered as an outlier. We therefore want 
uj to satisfy uj ^ (1/10)A~^ (not too small) and lu <C A^^ (not too large), so that 
when constructing the matrix 



each good equation will contribute approximately 1, while the contribution of the 
bad equations will be uniformly distributed on the unit circle. One may even try 
several different values for the "frequency" u in analogy to the Fourier transform. 
An overdetermined linear system of the form (|7.3p can also be solved using least 
squares, which is also the maximum likelihood estimator if the measurement errors are 
Gaussian. However, in the many outliers model, the contribution of outlier equations 
will dominate the sum of squares error. For example, each outlier equation with error 
lOA contributes to the sum of squares error the same as 100 good equations with 
error A. The compactification of the group combined with the eigenvector method 
has the appealing effect of reducing the impact of the outlier equations. This may 
open the way for the eigenvector method based on (|7.4p to be useful for the surface 
reconstruction problems in computer vision [13|, [1] and optics j30| in which current 
methods succeed only in the presence of a limited number of outliers. 
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